Dataset statistics
| Number of variables | 20 |
|---|---|
| Number of observations | 64068 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 30.0 MiB |
| Average record size in memory | 490.6 B |
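The two memory figures above are linked: the average record size is the total footprint divided by the number of observations. A minimal sketch of that check follows. It works backwards from the rounded values printed in the table, so agreement is only expected to one decimal place.

```python
# Figures copied from the dataset statistics table above.
n_observations = 64068
avg_record_bytes = 490.6      # "Average record size in memory"
reported_total_mib = 30.0     # "Total size in memory", rounded to one decimal

# Total size implied by the average record size, converted from bytes to MiB.
implied_total_mib = n_observations * avg_record_bytes / 1024**2

print(f"implied total: {implied_total_mib:.2f} MiB")
```

Rounded to one decimal, the implied total matches the reported 30.0 MiB.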
Variable types
| Numeric | 12 |
|---|---|
| DateTime | 1 |
| Categorical | 7 |
| Warning | Type |
|---|---|
| Company has constant value "Pink Cab" | Constant |
| df_index is highly correlated with Transaction ID and 1 other fields | High correlation |
| Transaction ID is highly correlated with df_index and 1 other fields | High correlation |
| KM Travelled is highly correlated with Price Charged and 1 other fields | High correlation |
| Price Charged is highly correlated with KM Travelled and 1 other fields | High correlation |
| Cost of Trip is highly correlated with KM Travelled and 1 other fields | High correlation |
| Year is highly correlated with df_index and 1 other fields | High correlation |
| Company is highly correlated with Year and 5 other fields | High correlation |
| Year is highly correlated with Company | High correlation |
| City is highly correlated with Company | High correlation |
| Holiday is highly correlated with Company | High correlation |
| Day of Week is highly correlated with Company | High correlation |
| Payment_Mode is highly correlated with Company | High correlation |
| Gender is highly correlated with Company | High correlation |
| df_index has unique values | Unique |
| Transaction ID has unique values | Unique |
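The high-correlation warnings above flag pairs of columns whose correlation coefficient is large in absolute value. The sketch below illustrates the idea with a hand-written Pearson coefficient. The toy columns are hypothetical (not taken from the profiled dataset), and the 0.9 threshold is an assumption rather than a value confirmed by this report.

```python
from itertools import combinations
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def high_correlation_pairs(columns, threshold=0.9):
    """Return (name_a, name_b, r) for every column pair with |r| >= threshold."""
    flagged = []
    for a, b in combinations(columns, 2):
        r = pearson(columns[a], columns[b])
        if abs(r) >= threshold:
            flagged.append((a, b, r))
    return flagged

# Hypothetical toy data, for illustration only.
toy = {
    "KM Travelled": [2.0, 10.0, 20.0, 30.0, 45.0],
    "Cost of Trip": [21.0, 110.0, 215.0, 330.0, 480.0],
    "Age": [40, 23, 65, 31, 52],
}
flagged = high_correlation_pairs(toy)
for a, b, r in flagged:
    print(f"{a} is highly correlated with {b} (r = {r:.3f})")
```

In this toy example only the distance and cost columns are flagged; the age column is weakly related to both and stays below the threshold.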
Reproduction
| Analysis started | 2021-02-27 19:45:57.261043 |
|---|---|
| Analysis finished | 2021-02-27 19:46:37.043638 |
| Duration | 39.78 seconds |
| Software version | pandas-profiling v2.10.1 |
| Download configuration | config.yaml |
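The reported duration is the difference between the two timestamps in the table above. A small sketch of that calculation, using only the standard library:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"

# Timestamps copied from the Reproduction table above.
started = datetime.strptime("2021-02-27 19:45:57.261043", FMT)
finished = datetime.strptime("2021-02-27 19:46:37.043638", FMT)

duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```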
df_index
Real number (ℝ≥0)
| Distinct | 64068 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 183900.9572 |
|---|---|
| Minimum | 8 |
| Maximum | 359390 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 500.7 KiB |
Quantile statistics
| Minimum | 8 |
|---|---|
| 5-th percentile | 22632.7 |
| Q1 | 91735.75 |
| median | 186113.5 |
| Q3 | 275792 |
| 95-th percentile | 342569.3 |
| Maximum | 359390 |
| Range | 359382 |
| Interquartile range (IQR) | 184056.25 |
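The range and interquartile range in the table above follow directly from the other quantiles. The sketch below recomputes them and, as an illustration, applies Tukey's 1.5 × IQR fences. The fences are a common convention for spotting outliers and are an addition here, not something this report computes.

```python
# Values copied from the quantile table above.
minimum, q1, q3, maximum = 8, 91735.75, 275792, 359390

value_range = maximum - minimum   # reported as 359382
iqr = q3 - q1                     # reported as 184056.25

# Tukey's 1.5 * IQR fences (an assumed convention, not part of the report).
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
has_outliers = minimum < lower_fence or maximum > upper_fence

print(value_range, iqr, has_outliers)
```

Both the minimum and the maximum fall inside the fences, so this rule flags no outliers for this variable.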
Descriptive statistics
| Standard deviation | 103392.1525 |
|---|---|
| Coefficient of variation (CV) | 0.5622165 |
| Kurtosis | -1.204769336 |
| Mean | 183900.9572 |
| Median Absolute Deviation (MAD) | 92085 |
| Skewness | -0.02088681297 |
| Sum | 1.178216652 × 10^10 |
| Variance | 1.06899372 × 10^10 |
| Monotonicity | Strictly increasing |
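The descriptive statistics above are tied together by simple identities: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, and the sum is the mean times the number of observations. A quick consistency check on the reported figures, with a small tolerance because the inputs are rounded:

```python
# Values copied from the descriptive statistics table above.
n_observations = 64068
mean = 183900.9572
std = 103392.1525

cv = std / mean              # reported as 0.5622165
variance = std ** 2          # reported as 1.06899372 x 10^10
total = mean * n_observations  # reported as 1.178216652 x 10^10

print(f"CV={cv:.7f}  variance={variance:.6e}  sum={total:.6e}")
```

The recomputed variance and sum are both of order 10^10, which agrees with the exponents in the table.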
| Value | Count | Frequency (%) |
| 264191 | 1 | < 0.1% |
| 160475 | 1 | < 0.1% |
| 209651 | 1 | < 0.1% |
| 80626 | 1 | < 0.1% |
| 207600 | 1 | < 0.1% |
| 119535 | 1 | < 0.1% |
| 246509 | 1 | < 0.1% |
| 258795 | 1 | < 0.1% |
| 316348 | 1 | < 0.1% |
| 256744 | 1 | < 0.1% |
| Other values (64058) | 64058 |
| Value | Count | Frequency (%) |
| 8 | 1 | |
| 9 | 1 | |
| 14 | 1 | |
| 19 | 1 | |
| 30 | 1 | |
| 45 | 1 | |
| 48 | 1 | |
| 54 | 1 | |
| 55 | 1 | |
| 65 | 1 |
| Value | Count | Frequency (%) |
| 359390 | 1 | |
| 359389 | 1 | |
| 359388 | 1 | |
| 359386 | 1 | |
| 359385 | 1 | |
| 359384 | 1 | |
| 359369 | 1 | |
| 359368 | 1 | |
| 359364 | 1 | |
| 359362 | 1 |
Transaction ID
Real number (ℝ≥0)
| Distinct | 64068 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 10224673.33 |
|---|---|
| Minimum | 10000011 |
| Maximum | 10437212 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 500.7 KiB |
Quantile statistics
| Minimum | 10000011 |
|---|---|
| 5-th percentile | 10027801.35 |
| Q1 | 10112718.75 |
| median | 10226083.5 |
| Q3 | 10338269.25 |
| 95-th percentile | 10416460.65 |
| Maximum | 10437212 |
| Range | 437201 |
| Interquartile range (IQR) | 225550.5 |
Descriptive statistics
| Standard deviation | 126258.8704 |
|---|---|
| Coefficient of variation (CV) | 0.01234845029 |
| Kurtosis | -1.209315365 |
| Mean | 10224673.33 |
| Median Absolute Deviation (MAD) | 113136 |
| Skewness | -0.02101934193 |
| Sum | 6.550743711 × 10^11 |
| Variance | 1.594130235 × 10^10 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 10105962 | 1 | < 0.1% |
| 10087976 | 1 | < 0.1% |
| 10051622 | 1 | < 0.1% |
| 10301629 | 1 | < 0.1% |
| 10124635 | 1 | < 0.1% |
| 10004363 | 1 | < 0.1% |
| 10416016 | 1 | < 0.1% |
| 10120508 | 1 | < 0.1% |
| 10124654 | 1 | < 0.1% |
| 10407549 | 1 | < 0.1% |
| Other values (64058) | 64058 |
| Value | Count | Frequency (%) |
| 10000011 | 1 | |
| 10000012 | 1 | |
| 10000013 | 1 | |
| 10000014 | 1 | |
| 10000015 | 1 | |
| 10000016 | 1 | |
| 10000017 | 1 | |
| 10000018 | 1 | |
| 10000019 | 1 | |
| 10000020 | 1 |
| Value | Count | Frequency (%) |
| 10437212 | 1 | |
| 10437198 | 1 | |
| 10437196 | 1 | |
| 10437194 | 1 | |
| 10437193 | 1 | |
| 10437191 | 1 | |
| 10437190 | 1 | |
| 10437189 | 1 | |
| 10437188 | 1 | |
| 10437187 | 1 |
Date of Travel
Date
| Distinct | 1095 |
|---|---|
| Distinct (%) | 1.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 500.7 KiB |
| Minimum | 2016-01-02 00:00:00 |
|---|---|
| Maximum | 2018-12-31 00:00:00 |
Company
Categorical
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 4.0 MiB |
| Pink Cab |
|---|
Length
| Max length | 8 |
|---|---|
| Median length | 8 |
| Mean length | 8 |
| Min length | 8 |
Characters and Unicode
| Total characters | 512544 |
|---|---|
| Distinct characters | 8 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Pink Cab |
|---|---|
| 2nd row | Pink Cab |
| 3rd row | Pink Cab |
| 4th row | Pink Cab |
| 5th row | Pink Cab |
| Value | Count | Frequency (%) |
| Pink Cab | 64068 |
| Value | Count | Frequency (%) |
| pink | 64068 | |
| cab | 64068 |
Most occurring characters
| Value | Count | Frequency (%) |
| P | 64068 | |
| i | 64068 | |
| n | 64068 | |
| k | 64068 | |
| (space) | 64068 | |
| C | 64068 | |
| a | 64068 | |
| b | 64068 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 320340 | |
| Uppercase Letter | 128136 | 25.0% |
| Space Separator | 64068 | 12.5% |
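Because every row holds the same string, the category counts above can be reproduced from a single value. The sketch below classifies each character of "Pink Cab" with `unicodedata.category` and scales by the row count. The two-letter codes `Ll`, `Lu` and `Zs` are the Unicode general categories for Lowercase Letter, Uppercase Letter and Space Separator.

```python
import unicodedata
from collections import Counter

value = "Pink Cab"
n_rows = 64068  # every observation has this same value

# Count Unicode general categories within one value, then scale to all rows.
per_value = Counter(unicodedata.category(ch) for ch in value)
totals = {cat: count * n_rows for cat, count in per_value.items()}
total_chars = len(value) * n_rows
shares = {cat: 100 * count / total_chars for cat, count in totals.items()}

for cat, count in totals.items():
    print(f"{cat}: {count} ({shares[cat]:.1f}%)")
```

The total of 512544 characters and the 25.0% and 12.5% shares match the tables above; the remaining share belongs to the lowercase letters.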
Most frequent character per category
| Value | Count | Frequency (%) |
| i | 64068 | |
| n | 64068 | |
| k | 64068 | |
| a | 64068 | |
| b | 64068 |
| Value | Count | Frequency (%) |
| P | 64068 | |
| C | 64068 |
| Value | Count | Frequency (%) |
| (space) | 64068 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 448476 | |
| Common | 64068 | 12.5% |
Most frequent character per script
| Value | Count | Frequency (%) |
| P | 64068 | |
| i | 64068 | |
| n | 64068 | |
| k | 64068 | |
| C | 64068 | |
| a | 64068 | |
| b | 64068 |
| Value | Count | Frequency (%) |
| (space) | 64068 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 512544 |
Most frequent character per block
| Value | Count | Frequency (%) |
| P | 64068 | |
| i | 64068 | |
| n | 64068 | |
| k | 64068 | |
| (space) | 64068 | |
| C | 64068 | |
| a | 64068 | |
| b | 64068 |
City
Categorical
| Distinct | 15 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 4.2 MiB |
| LOS ANGELES CA | |
|---|---|
| NEW YORK NY | |
| CHICAGO IL | |
| BOSTON MA | |
| MIAMI FL | |
| Other values (10) |
Length
| Max length | 14 |
|---|---|
| Median length | 11 |
| Mean length | 11.49781482 |
| Min length | 8 |
Characters and Unicode
| Total characters | 736642 |
|---|---|
| Distinct characters | 25 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | PHOENIX AZ |
|---|---|
| 2nd row | PHOENIX AZ |
| 3rd row | NEW YORK NY |
| 4th row | LOS ANGELES CA |
| 5th row | CHICAGO IL |
| Value | Count | Frequency (%) |
| LOS ANGELES CA | 19865 | |
| NEW YORK NY | 13967 | |
| CHICAGO IL | 9361 | |
| BOSTON MA | 5186 | 8.1% |
| MIAMI FL | 2002 | 3.1% |
| AUSTIN TX | 1868 | 2.9% |
| NASHVILLE TN | 1841 | 2.9% |
| ATLANTA GA | 1762 | 2.8% |
| ORANGE COUNTY | 1513 | 2.4% |
| DENVER CO | 1394 | 2.2% |
| Other values (5) | 5309 | 8.3% |
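Each Frequency (%) entry in the table above is the row count divided by the total number of observations. A short sketch that recomputes the percentages for the five most common cities, using the counts from the table:

```python
n_rows = 64068

# Counts copied from the frequency table above.
counts = {
    "LOS ANGELES CA": 19865,
    "NEW YORK NY": 13967,
    "CHICAGO IL": 9361,
    "BOSTON MA": 5186,
    "MIAMI FL": 2002,
}

# Percentage of all observations, rounded to one decimal as in the report.
frequency_pct = {city: round(100 * n / n_rows, 1) for city, n in counts.items()}

for city, pct in frequency_pct.items():
    print(f"{city}: {pct}%")
```

The results for Boston and Miami agree with the 8.1% and 3.1% shown in the table, and the same formula gives the shares for rows where the percentage cell is blank.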
| Value | Count | Frequency (%) |
| ca | 22248 | |
| angeles | 19865 | |
| los | 19865 | |
| new | 13967 | |
| york | 13967 | |
| ny | 13967 | |
| chicago | 9361 | 5.7% |
| il | 9361 | 5.7% |
| boston | 5186 | 3.2% |
| ma | 5186 | 3.2% |
| Other values (20) | 30044 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 98949 | |
| A | 78955 | |
| N | 67964 | |
| E | 63086 | |
| O | 61232 | |
| L | 59297 | |
| S | 53070 | 7.2% |
| C | 45211 | 6.1% |
| G | 34232 | 4.6% |
| Y | 29447 | 4.0% |
| Other values (15) | 145199 |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 637693 | |
| Space Separator | 98949 | 13.4% |
Most frequent character per category
| Value | Count | Frequency (%) |
| A | 78955 | |
| N | 67964 | |
| E | 63086 | |
| O | 61232 | |
| L | 59297 | |
| S | 53070 | |
| C | 45211 | 7.1% |
| G | 34232 | 5.4% |
| Y | 29447 | 4.6% |
| I | 29030 | 4.6% |
| Other values (14) | 116169 |
| Value | Count | Frequency (%) |
| (space) | 98949 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 637693 | |
| Common | 98949 | 13.4% |
Most frequent character per script
| Value | Count | Frequency (%) |
| A | 78955 | |
| N | 67964 | |
| E | 63086 | |
| O | 61232 | |
| L | 59297 | |
| S | 53070 | |
| C | 45211 | 7.1% |
| G | 34232 | 5.4% |
| Y | 29447 | 4.6% |
| I | 29030 | 4.6% |
| Other values (14) | 116169 |
| Value | Count | Frequency (%) |
| (space) | 98949 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 736642 |
Most frequent character per block
| Value | Count | Frequency (%) |
| (space) | 98949 | |
| A | 78955 | |
| N | 67964 | |
| E | 63086 | |
| O | 61232 | |
| L | 59297 | |
| S | 53070 | 7.2% |
| C | 45211 | 6.1% |
| G | 34232 | 4.6% |
| Y | 29447 | 4.0% |
| Other values (15) | 145199 |
KM Travelled
Real number (ℝ≥0)
| Distinct | 874 |
|---|---|
| Distinct (%) | 1.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 22.55923987 |
|---|---|
| Minimum | 1.9 |
| Maximum | 48 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 500.7 KiB |
Quantile statistics
| Minimum | 1.9 |
|---|---|
| 5-th percentile | 3.5805 |
| Q1 | 12 |
| median | 22.44 |
| Q3 | 32.86 |
| 95-th percentile | 42 |
| Maximum | 48 |
| Range | 46.1 |
| Interquartile range (IQR) | 20.86 |
Descriptive statistics
| Standard deviation | 12.21873606 |
|---|---|
| Coefficient of variation (CV) | 0.5416288906 |
| Kurtosis | -1.126642246 |
| Mean | 22.55923987 |
| Median Absolute Deviation (MAD) | 10.44 |
| Skewness | 0.05570736882 |
| Sum | 1445325.38 |
| Variance | 149.297511 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 33.6 | 271 | 0.4% |
| 39.6 | 202 | 0.3% |
| 22.8 | 195 | 0.3% |
| 24 | 195 | 0.3% |
| 35.7 | 190 | 0.3% |
| 37.44 | 185 | 0.3% |
| 16.8 | 183 | 0.3% |
| 28.08 | 180 | 0.3% |
| 27 | 152 | 0.2% |
| 42.18 | 150 | 0.2% |
| Other values (864) | 62165 |
| Value | Count | Frequency (%) |
| 1.9 | 50 | |
| 1.92 | 63 | |
| 1.94 | 57 | |
| 1.96 | 75 | |
| 1.98 | 68 | |
| 2 | 63 | |
| 2.02 | 56 | |
| 2.04 | 61 | |
| 2.06 | 68 | |
| 2.08 | 76 |
| Value | Count | Frequency (%) |
| 48 | 62 | |
| 47.6 | 50 | 0.1% |
| 47.2 | 57 | 0.1% |
| 46.8 | 143 | |
| 46.41 | 65 | |
| 46.4 | 61 | |
| 46.02 | 59 | |
| 46 | 57 | 0.1% |
| 45.63 | 73 | |
| 45.6 | 135 |
Price Charged
Real number (ℝ≥0)
| Distinct | 40501 |
|---|---|
| Distinct (%) | 63.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 311.0193337 |
|---|---|
| Minimum | 15.6 |
| Maximum | 1623.48 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 500.7 KiB |
Quantile statistics
| Minimum | 15.6 |
|---|---|
| 5-th percentile | 48.8435 |
| Q1 | 159.8325 |
| median | 297.6 |
| Q3 | 440.75 |
| 95-th percentile | 625.7195 |
| Maximum | 1623.48 |
| Range | 1607.88 |
| Interquartile range (IQR) | 280.9175 |
Descriptive statistics
| Standard deviation | 183.0481327 |
|---|---|
| Coefficient of variation (CV) | 0.5885426172 |
| Kurtosis | -0.08453434762 |
| Mean | 311.0193337 |
| Median Absolute Deviation (MAD) | 140.4 |
| Skewness | 0.4909935408 |
| Sum | 19926386.67 |
| Variance | 33506.61887 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 180.35 | 8 | < 0.1% |
| 56.1 | 7 | < 0.1% |
| 204.21 | 7 | < 0.1% |
| 459.24 | 7 | < 0.1% |
| 178.49 | 7 | < 0.1% |
| 367.68 | 7 | < 0.1% |
| 169.27 | 7 | < 0.1% |
| 393.51 | 7 | < 0.1% |
| 195.29 | 7 | < 0.1% |
| 248.41 | 7 | < 0.1% |
| Other values (40491) | 63997 |
| Value | Count | Frequency (%) |
| 15.6 | 1 | |
| 15.75 | 1 | |
| 16.38 | 1 | |
| 16.53 | 1 | |
| 16.76 | 1 | |
| 17.03 | 1 | |
| 17.11 | 1 | |
| 17.21 | 1 | |
| 17.27 | 1 | |
| 17.46 | 1 |
| Value | Count | Frequency (%) |
| 1623.48 | 1 | |
| 1517.15 | 1 | |
| 1495.6 | 1 | |
| 1377.73 | 1 | |
| 1368.66 | 1 | |
| 1359.59 | 1 | |
| 1339.31 | 1 | |
| 1332.98 | 1 | |
| 1319.52 | 1 | |
| 1235.96 | 1 |
Cost of Trip
Real number (ℝ≥0)
| Distinct | 9659 |
|---|---|
| Distinct (%) | 15.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 248.1118607 |
|---|---|
| Minimum | 19.19 |
| Maximum | 576 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 500.7 KiB |
Quantile statistics
| Minimum | 19.19 |
|---|---|
| 5-th percentile | 40.341 |
| Q1 | 132 |
| median | 246.42 |
| Q3 | 360.126 |
| 95-th percentile | 463.904 |
| Maximum | 576 |
| Range | 556.81 |
| Interquartile range (IQR) | 228.126 |
Descriptive statistics
| Standard deviation | 135.2614952 |
|---|---|
| Coefficient of variation (CV) | 0.5451633582 |
| Kurtosis | -1.079802523 |
| Mean | 248.1118607 |
| Median Absolute Deviation (MAD) | 114.024 |
| Skewness | 0.08607470702 |
| Sum | 15896030.69 |
| Variance | 18295.67208 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 428.4 | 42 | 0.1% |
| 393.12 | 40 | 0.1% |
| 280.8 | 36 | 0.1% |
| 342.72 | 35 | 0.1% |
| 205.2 | 34 | 0.1% |
| 181.44 | 34 | 0.1% |
| 383.04 | 33 | 0.1% |
| 336 | 33 | 0.1% |
| 356.4 | 32 | < 0.1% |
| 399.84 | 32 | < 0.1% |
| Other values (9649) | 63717 |
| Value | Count | Frequency (%) |
| 19.19 | 4 | |
| 19.2 | 3 | |
| 19.38 | 2 | |
| 19.392 | 1 | < 0.1% |
| 19.4 | 3 | |
| 19.57 | 3 | |
| 19.584 | 1 | < 0.1% |
| 19.594 | 3 | |
| 19.6 | 2 | |
| 19.76 | 2 |
| Value | Count | Frequency (%) |
| 576 | 3 | < 0.1% |
| 571.2 | 4 | < 0.1% |
| 566.44 | 2 | < 0.1% |
| 566.4 | 6 | |
| 561.68 | 5 | |
| 561.6 | 10 | |
| 556.96 | 2 | < 0.1% |
| 556.92 | 10 | |
| 556.8 | 9 | |
| 552.279 | 5 |
Customer ID
Real number (ℝ≥0)
| Distinct | 22955 |
|---|---|
| Distinct (%) | 35.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 15501.70488 |
|---|---|
| Minimum | 1 |
| Maximum | 60000 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 500.7 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 706 |
| Q1 | 3664.75 |
| median | 7311 |
| Q3 | 22056.25 |
| 95-th percentile | 58167.65 |
| Maximum | 60000 |
| Range | 59999 |
| Interquartile range (IQR) | 18391.5 |
Descriptive statistics
| Standard deviation | 18264.4499 |
|---|---|
| Coefficient of variation (CV) | 1.178222012 |
| Kurtosis | 0.6252787027 |
| Mean | 15501.70488 |
| Median Absolute Deviation (MAD) | 4505 |
| Skewness | 1.433872799 |
| Sum | 993163228 |
| Variance | 333590130.3 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 8120 | 18 | < 0.1% |
| 8595 | 17 | < 0.1% |
| 7927 | 17 | < 0.1% |
| 6159 | 17 | < 0.1% |
| 7340 | 16 | < 0.1% |
| 8474 | 16 | < 0.1% |
| 8915 | 16 | < 0.1% |
| 7988 | 15 | < 0.1% |
| 7764 | 15 | < 0.1% |
| 8721 | 15 | < 0.1% |
| Other values (22945) | 63906 |
| Value | Count | Frequency (%) |
| 1 | 4 | |
| 2 | 4 | |
| 3 | 6 | |
| 4 | 1 | < 0.1% |
| 5 | 8 | |
| 6 | 5 | |
| 7 | 2 | < 0.1% |
| 8 | 6 | |
| 9 | 5 | |
| 10 | 3 | < 0.1% |
| Value | Count | Frequency (%) |
| 60000 | 4 | |
| 59999 | 2 | |
| 59998 | 3 | |
| 59997 | 2 | |
| 59995 | 2 | |
| 59994 | 3 | |
| 59993 | 1 | < 0.1% |
| 59992 | 3 | |
| 59991 | 2 | |
| 59990 | 2 |
Payment_Mode
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.7 MiB |
| Card | |
|---|---|
| Cash |
Length
| Max length | 4 |
|---|---|
| Median length | 4 |
| Mean length | 4 |
| Min length | 4 |
Characters and Unicode
| Total characters | 256272 |
|---|---|
| Distinct characters | 6 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Cash |
|---|---|
| 2nd row | Card |
| 3rd row | Card |
| 4th row | Card |
| 5th row | Card |
| Value | Count | Frequency (%) |
| Card | 38290 | |
| Cash | 25778 |
| Value | Count | Frequency (%) |
| card | 38290 | |
| cash | 25778 |
Most occurring characters
| Value | Count | Frequency (%) |
| C | 64068 | |
| a | 64068 | |
| r | 38290 | |
| d | 38290 | |
| s | 25778 | |
| h | 25778 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 192204 | |
| Uppercase Letter | 64068 | 25.0% |
Most frequent character per category
| Value | Count | Frequency (%) |
| a | 64068 | |
| r | 38290 | |
| d | 38290 | |
| s | 25778 | |
| h | 25778 |
| Value | Count | Frequency (%) |
| C | 64068 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 256272 |
Most frequent character per script
| Value | Count | Frequency (%) |
| C | 64068 | |
| a | 64068 | |
| r | 38290 | |
| d | 38290 | |
| s | 25778 | |
| h | 25778 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 256272 |
Most frequent character per block
| Value | Count | Frequency (%) |
| C | 64068 | |
| a | 64068 | |
| r | 38290 | |
| d | 38290 | |
| s | 25778 | |
| h | 25778 |
Gender
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.8 MiB |
| Male | |
|---|---|
| Female |
Length
| Max length | 6 |
|---|---|
| Median length | 4 |
| Mean length | 4.860897796 |
| Min length | 4 |
Characters and Unicode
| Total characters | 311428 |
|---|---|
| Distinct characters | 6 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Male |
|---|---|
| 2nd row | Male |
| 3rd row | Male |
| 4th row | Male |
| 5th row | Male |
| Value | Count | Frequency (%) |
| Male | 36490 | |
| Female | 27578 |
| Value | Count | Frequency (%) |
| male | 36490 | |
| female | 27578 |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 91646 | |
| a | 64068 | |
| l | 64068 | |
| M | 36490 | 11.7% |
| F | 27578 | 8.9% |
| m | 27578 | 8.9% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 247360 | |
| Uppercase Letter | 64068 | 20.6% |
Most frequent character per category
| Value | Count | Frequency (%) |
| e | 91646 | |
| a | 64068 | |
| l | 64068 | |
| m | 27578 | 11.1% |
| Value | Count | Frequency (%) |
| M | 36490 | |
| F | 27578 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 311428 |
Most frequent character per script
| Value | Count | Frequency (%) |
| e | 91646 | |
| a | 64068 | |
| l | 64068 | |
| M | 36490 | 11.7% |
| F | 27578 | 8.9% |
| m | 27578 | 8.9% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 311428 |
Most frequent character per block
| Value | Count | Frequency (%) |
| e | 91646 | |
| a | 64068 | |
| l | 64068 | |
| M | 36490 | 11.7% |
| F | 27578 | 8.9% |
| m | 27578 | 8.9% |
Age
Real number (ℝ≥0)
| Distinct | 48 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 35.40116751 |
|---|---|
| Minimum | 18 |
| Maximum | 65 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 500.7 KiB |
Quantile statistics
| Minimum | 18 |
|---|---|
| 5-th percentile | 19 |
| Q1 | 25 |
| median | 33 |
| Q3 | 42 |
| 95-th percentile | 61 |
| Maximum | 65 |
| Range | 47 |
| Interquartile range (IQR) | 17 |
Descriptive statistics
| Standard deviation | 12.67668283 |
|---|---|
| Coefficient of variation (CV) | 0.3580865752 |
| Kurtosis | -0.4778097077 |
| Mean | 35.40116751 |
| Median Absolute Deviation (MAD) | 8 |
| Skewness | 0.6826095716 |
| Sum | 2268082 |
| Variance | 160.6982876 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 23 | 2237 | 3.5% |
| 32 | 2210 | 3.4% |
| 26 | 2210 | 3.4% |
| 19 | 2195 | 3.4% |
| 20 | 2126 | 3.3% |
| 25 | 2113 | 3.3% |
| 39 | 2104 | 3.3% |
| 40 | 2076 | 3.2% |
| 22 | 2069 | 3.2% |
| 34 | 2065 | 3.2% |
| Other values (38) | 42663 |
| Value | Count | Frequency (%) |
| 18 | 1924 | |
| 19 | 2195 | |
| 20 | 2126 | |
| 21 | 1996 | |
| 22 | 2069 | |
| 23 | 2237 | |
| 24 | 1954 | |
| 25 | 2113 | |
| 26 | 2210 | |
| 27 | 2050 |
| Value | Count | Frequency (%) |
| 65 | 632 | |
| 64 | 701 | |
| 63 | 718 | |
| 62 | 674 | |
| 61 | 794 | |
| 60 | 672 | |
| 59 | 717 | |
| 58 | 755 | |
| 57 | 682 | |
| 56 | 633 |
Income (USD/Month)
Real number (ℝ≥0)
| Distinct | 15644 |
|---|---|
| Distinct (%) | 24.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 15082.44144 |
|---|---|
| Minimum | 2001 |
| Maximum | 35000 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 500.7 KiB |
Quantile statistics
| Minimum | 2001 |
|---|---|
| 5-th percentile | 3241.35 |
| Q1 | 8387 |
| median | 14761.5 |
| Q3 | 21079.25 |
| 95-th percentile | 29704 |
| Maximum | 35000 |
| Range | 32999 |
| Interquartile range (IQR) | 12692.25 |
Descriptive statistics
| Standard deviation | 7996.215513 |
|---|---|
| Coefficient of variation (CV) | 0.5301671845 |
| Kurtosis | -0.6750578341 |
| Mean | 15082.44144 |
| Median Absolute Deviation (MAD) | 6338.5 |
| Skewness | 0.3007725391 |
| Sum | 966301858 |
| Variance | 63939462.52 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 15515 | 33 | 0.1% |
| 15818 | 28 | < 0.1% |
| 5532 | 28 | < 0.1% |
| 18178 | 27 | < 0.1% |
| 6099 | 26 | < 0.1% |
| 16809 | 25 | < 0.1% |
| 20884 | 25 | < 0.1% |
| 14300 | 25 | < 0.1% |
| 7355 | 25 | < 0.1% |
| 16965 | 25 | < 0.1% |
| Other values (15634) | 63801 |
| Value | Count | Frequency (%) |
| 2001 | 1 | < 0.1% |
| 2002 | 2 | < 0.1% |
| 2007 | 11 | |
| 2009 | 1 | < 0.1% |
| 2012 | 7 | |
| 2015 | 1 | < 0.1% |
| 2019 | 2 | < 0.1% |
| 2020 | 4 | < 0.1% |
| 2021 | 3 | < 0.1% |
| 2022 | 3 | < 0.1% |
| Value | Count | Frequency (%) |
| 35000 | 1 | < 0.1% |
| 34995 | 2 | < 0.1% |
| 34989 | 7 | |
| 34985 | 2 | < 0.1% |
| 34984 | 14 | |
| 34983 | 1 | < 0.1% |
| 34979 | 1 | < 0.1% |
| 34973 | 1 | < 0.1% |
| 34972 | 2 | < 0.1% |
| 34967 | 6 |
Population
Real number (ℝ≥0)
| Distinct | 15 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2833514.834 |
|---|---|
| Minimum | 248968 |
| Maximum | 8405837 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 500.7 KiB |
Quantile statistics
| Minimum | 248968 |
|---|---|
| 5-th percentile | 248968 |
| Q1 | 943999 |
| median | 1595037 |
| Q3 | 1955130 |
| 95-th percentile | 8405837 |
| Maximum | 8405837 |
| Range | 8156869 |
| Interquartile range (IQR) | 1011131 |
Descriptive statistics
| Standard deviation | 2985291.418 |
|---|---|
| Coefficient of variation (CV) | 1.053564775 |
| Kurtosis | -0.2407254656 |
| Mean | 2833514.834 |
| Median Absolute Deviation (MAD) | 564852 |
| Skewness | 1.259678092 |
| Sum | 1.815376284 × 10^11 |
| Variance | 8.911964847 × 10^12 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1595037 | 19865 | |
| 8405837 | 13967 | |
| 1955130 | 9361 | |
| 248968 | 5186 | 8.1% |
| 1339155 | 2002 | 3.1% |
| 698371 | 1868 | 2.9% |
| 327225 | 1841 | 2.9% |
| 814885 | 1762 | 2.8% |
| 1030185 | 1513 | 2.4% |
| 754233 | 1394 | 2.2% |
| Other values (5) | 5309 | 8.3% |
| Value | Count | Frequency (%) |
| 248968 | 5186 | |
| 327225 | 1841 | 2.9% |
| 542085 | 682 | 1.1% |
| 545776 | 1334 | 2.1% |
| 698371 | 1868 | 2.9% |
| 754233 | 1394 | 2.2% |
| 814885 | 1762 | 2.8% |
| 942908 | 1380 | 2.2% |
| 943999 | 864 | 1.3% |
| 959307 | 1049 | 1.6% |
| Value | Count | Frequency (%) |
| 8405837 | 13967 | |
| 1955130 | 9361 | |
| 1595037 | 19865 | |
| 1339155 | 2002 | 3.1% |
| 1030185 | 1513 | 2.4% |
| 959307 | 1049 | 1.6% |
| 943999 | 864 | 1.3% |
| 942908 | 1380 | 2.2% |
| 814885 | 1762 | 2.8% |
| 754233 | 1394 | 2.2% |
Users
Real number (ℝ≥0)
| Distinct | 15 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 145470.1403 |
|---|---|
| Minimum | 3643 |
| Maximum | 302149 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 500.7 KiB |
Quantile statistics
| Minimum | 3643 |
|---|---|
| 5-th percentile | 9270 |
| Q1 | 80021 |
| median | 144132 |
| Q3 | 164468 |
| 95-th percentile | 302149 |
| Maximum | 302149 |
| Range | 298506 |
| Interquartile range (IQR) | 84447 |
Descriptive statistics
| Standard deviation | 98934.80855 |
|---|---|
| Coefficient of variation (CV) | 0.6801038918 |
| Kurtosis | -0.8910507885 |
| Mean | 145470.1403 |
| Median Absolute Deviation (MAD) | 64111 |
| Skewness | 0.2995650954 |
| Sum | 9319980948 |
| Variance | 9788096343 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 144132 | 19865 | |
| 302149 | 13967 | |
| 164468 | 9361 | |
| 80021 | 5186 | 8.1% |
| 17675 | 2002 | 3.1% |
| 14978 | 1868 | 2.9% |
| 9270 | 1841 | 2.9% |
| 24701 | 1762 | 2.8% |
| 12994 | 1513 | 2.4% |
| 12421 | 1394 | 2.2% |
| Other values (5) | 5309 | 8.3% |
| Value | Count | Frequency (%) |
| 3643 | 682 | 1.1% |
| 6133 | 864 | |
| 7044 | 1334 | |
| 9270 | 1841 | |
| 12421 | 1394 | |
| 12994 | 1513 | |
| 14978 | 1868 | |
| 17675 | 2002 | |
| 22157 | 1380 | |
| 24701 | 1762 |
| Value | Count | Frequency (%) |
| 302149 | 13967 | |
| 164468 | 9361 | |
| 144132 | 19865 | |
| 80021 | 5186 | 8.1% |
| 69995 | 1049 | 1.6% |
| 24701 | 1762 | 2.8% |
| 22157 | 1380 | 2.2% |
| 17675 | 2002 | 3.1% |
| 14978 | 1868 | 2.9% |
| 12994 | 1513 | 2.4% |
Holiday
Categorical
| Distinct | 11 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.6 MiB |
| - | |
|---|---|
| Christmas Day | 314 |
| Thanksgiving Day | 146 |
| Memorial Day | 120 |
| Independence Day | 113 |
| Other values (6) | 550 |
Length
| Max length | 37 |
|---|---|
| Median length | 1 |
| Mean length | 1.291892989 |
| Min length | 1 |
Characters and Unicode
| Total characters | 82769 |
|---|---|
| Distinct characters | 40 |
| Distinct categories | 7 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | - |
|---|---|
| 2nd row | - |
| 3rd row | - |
| 4th row | - |
| 5th row | - |
| Value | Count | Frequency (%) |
| - | 62825 | |
| Christmas Day | 314 | 0.5% |
| Thanksgiving Day | 146 | 0.2% |
| Memorial Day | 120 | 0.2% |
| Independence Day | 113 | 0.2% |
| Veterans Day | 105 | 0.2% |
| New Year Day | 103 | 0.2% |
| Presidents Day (Washingtons Birthday) | 100 | 0.2% |
| Martin Luther King Jr. Day | 100 | 0.2% |
| Labor Day | 74 | 0.1% |
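The length statistics for this variable can be cross-checked against the value counts above: the total character count is the sum of each label's length times its row count, and the mean length is that total divided by the number of rows. The sketch below does this for the ten listed values. The report lists 11 distinct values, so one label is not shown; the code only reports how many rows and characters are left over for it and does not guess its name.

```python
n_rows = 64068
total_characters = 82769  # "Total characters" reported for this variable

# Labels and counts copied from the frequency table above.
listed = {
    "-": 62825,
    "Christmas Day": 314,
    "Thanksgiving Day": 146,
    "Memorial Day": 120,
    "Independence Day": 113,
    "Veterans Day": 105,
    "New Year Day": 103,
    "Presidents Day (Washingtons Birthday)": 100,
    "Martin Luther King Jr. Day": 100,
    "Labor Day": 74,
}

listed_rows = sum(listed.values())
listed_chars = sum(len(label) * count for label, count in listed.items())

# Whatever is left belongs to the one value the table does not show.
remaining_rows = n_rows - listed_rows
remaining_chars = total_characters - listed_chars

mean_length = total_characters / n_rows
max_listed_length = max(len(label) for label in listed)

print(mean_length, max_listed_length, remaining_rows, remaining_chars)
```

The mean length and the longest listed label agree with the reported 1.291892989 and 37.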
| Value | Count | Frequency (%) |
| (empty) | 62825 | |
| day | 1243 | 1.9% |
| christmas | 314 | 0.5% |
| thanksgiving | 146 | 0.2% |
| memorial | 120 | 0.2% |
| independence | 113 | 0.2% |
| veterans | 105 | 0.2% |
| new | 103 | 0.2% |
| year | 103 | 0.2% |
| jr | 100 | 0.2% |
| Other values (8) | 742 | 1.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| - | 62825 | |
| a | 2405 | 2.9% |
| (space) | 1846 | 2.2% |
| s | 1347 | 1.6% |
| y | 1343 | 1.6% |
| e | 1288 | 1.6% |
| D | 1243 | 1.5% |
| n | 1236 | 1.5% |
| i | 1226 | 1.5% |
| r | 1216 | 1.5% |
| Other values (30) | 6794 | 8.2% |
Most occurring categories
| Value | Count | Frequency (%) |
| Dash Punctuation | 62825 | |
| Lowercase Letter | 14709 | 17.8% |
| Uppercase Letter | 3089 | 3.7% |
| Space Separator | 1846 | 2.2% |
| Open Punctuation | 100 | 0.1% |
| Close Punctuation | 100 | 0.1% |
| Other Punctuation | 100 | 0.1% |
Most frequent character per category
| Value | Count | Frequency (%) |
| a | 2405 | |
| s | 1347 | |
| y | 1343 | |
| e | 1288 | |
| n | 1236 | |
| i | 1226 | |
| r | 1216 | |
| t | 919 | 6.2% |
| h | 760 | 5.2% |
| m | 502 | 3.4% |
| Other values (11) | 2467 |
| Value | Count | Frequency (%) |
| D | 1243 | |
| C | 382 | 12.4% |
| M | 220 | 7.1% |
| L | 174 | 5.6% |
| T | 146 | 4.7% |
| I | 113 | 3.7% |
| V | 105 | 3.4% |
| N | 103 | 3.3% |
| Y | 103 | 3.3% |
| P | 100 | 3.2% |
| Other values (4) | 400 | 12.9% |
| Value | Count | Frequency (%) |
| - | 62825 |
| Value | Count | Frequency (%) |
| (space) | 1846 |
| Value | Count | Frequency (%) |
| ( | 100 |
| Value | Count | Frequency (%) |
| ) | 100 |
| Value | Count | Frequency (%) |
| . | 100 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 64971 | |
| Latin | 17798 | 21.5% |
Most frequent character per script
| Value | Count | Frequency (%) |
| a | 2405 | |
| s | 1347 | 7.6% |
| y | 1343 | 7.5% |
| e | 1288 | 7.2% |
| D | 1243 | 7.0% |
| n | 1236 | 6.9% |
| i | 1226 | 6.9% |
| r | 1216 | 6.8% |
| t | 919 | 5.2% |
| h | 760 | 4.3% |
| Other values (25) | 4815 |
| Value | Count | Frequency (%) |
| - | 62825 | |
| (space) | 1846 | 2.8% |
| ( | 100 | 0.2% |
| ) | 100 | 0.2% |
| . | 100 | 0.2% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 82769 |
Most frequent character per block
| Value | Count | Frequency (%) |
| - | 62825 | |
| a | 2405 | 2.9% |
| (space) | 1846 | 2.2% |
| s | 1347 | 1.6% |
| y | 1343 | 1.6% |
| e | 1288 | 1.6% |
| D | 1243 | 1.5% |
| n | 1236 | 1.5% |
| i | 1226 | 1.5% |
| r | 1216 | 1.5% |
| Other values (30) | 6794 | 8.2% |
Profit
Real number (ℝ)
| Distinct | 55618 |
|---|---|
| Distinct (%) | 86.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 62.90747295 |
|---|---|
| Minimum | -220.06 |
| Maximum | 1119.48 |
| Zeros | 1 |
| Zeros (%) | < 0.1% |
| Memory size | 500.7 KiB |
Quantile statistics
| Minimum | -220.06 |
|---|---|
| 5-th percentile | -20.7233 |
| Q1 | 10.3175 |
| median | 40.676 |
| Q3 | 94.55575 |
| 95-th percentile | 220.143 |
| Maximum | 1119.48 |
| Range | 1339.54 |
| Interquartile range (IQR) | 84.23825 |
Descriptive statistics
| Standard deviation | 80.22762043 |
|---|---|
| Coefficient of variation (CV) | 1.275327345 |
| Kurtosis | 7.273451088 |
| Mean | 62.90747295 |
| Median Absolute Deviation (MAD) | 35.988 |
| Skewness | 1.913260551 |
| Sum | 4030355.977 |
| Variance | 6436.47108 |
| Monotonicity | Not monotonic |
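The report does not say how Profit was derived. One plausible reading, treated here strictly as an assumption, is that it equals Price Charged minus Cost of Trip for each row. If so, the means and sums of the three columns should satisfy the same relation. The sketch below checks this against the figures in this report, taking the variable with mean 311.0193337 as Price Charged and the one with mean 248.1118607 as Cost of Trip.

```python
# Reported means and sums (assumed mapping of statistics to column names).
price_mean, price_sum = 311.0193337, 19926386.67
cost_mean, cost_sum = 248.1118607, 15896030.69
profit_mean, profit_sum = 62.90747295, 4030355.977

# Under the assumption Profit = Price Charged - Cost of Trip, these should be ~0.
mean_gap = (price_mean - cost_mean) - profit_mean
sum_gap = (price_sum - cost_sum) - profit_sum

print(f"mean gap: {mean_gap:.2e}, sum gap: {sum_gap:.2e}")
```

Both gaps are within rounding error, so the reported statistics are at least consistent with that assumption. They do not prove it.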
| Value | Count | Frequency (%) |
| 6.19 | 9 | < 0.1% |
| 5.56 | 8 | < 0.1% |
| 8.34 | 7 | < 0.1% |
| 34.06 | 7 | < 0.1% |
| 18.56 | 7 | < 0.1% |
| -0.22 | 7 | < 0.1% |
| 0.56 | 7 | < 0.1% |
| 13.31 | 7 | < 0.1% |
| 32.31 | 7 | < 0.1% |
| 21.43 | 7 | < 0.1% |
| Other values (55608) | 63995 |
| Value | Count | Frequency (%) |
| -220.06 | 1 | |
| -198.698 | 1 | |
| -168.985 | 1 | |
| -164.04 | 1 | |
| -160.536 | 1 | |
| -153.25 | 1 | |
| -150.38 | 1 | |
| -148.586 | 1 | |
| -147.477 | 1 | |
| -144.68 | 1 |
| Value | Count | Frequency (%) |
| 1119.48 | 1 | |
| 1056.11 | 1 | |
| 1039.08 | 1 | |
| 982.59 | 1 | |
| 971.17 | 1 | |
| 907.92 | 1 | |
| 900.804 | 1 | |
| 868.71 | 1 | |
| 867.04 | 1 | |
| 827.54 | 1 |
Year
Categorical
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.8 MiB |
| 2017.0 | |
|---|---|
| 2018.0 | |
| 2016.0 |
Length
| Max length | 6 |
|---|---|
| Median length | 6 |
| Mean length | 6 |
| Min length | 6 |
Characters and Unicode
| Total characters | 384408 |
|---|---|
| Distinct characters | 7 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
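The six-character labels such as 2016.0 suggest that the year was held as a floating-point number before being treated as a category, although the report does not state this. A minimal sketch, under that assumption, of how such labels could be normalised to integer years, and of how the reported total character count follows from the fixed label length:

```python
n_rows = 64068

# Category labels as they appear in this report.
year_labels = ["2016.0", "2017.0", "2018.0"]

# Parse via float first, since int("2016.0") would raise ValueError.
years = [int(float(label)) for label in year_labels]

# Every label has six characters, so the total is 6 * number of rows.
total_characters = len(year_labels[0]) * n_rows

print(years, total_characters)
```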
Sample
| 1st row | 2016.0 |
|---|---|
| 2nd row | 2016.0 |
| 3rd row | 2016.0 |
| 4th row | 2016.0 |
| 5th row | 2016.0 |
| Value | Count | Frequency (%) |
| 2017.0 | 22946 | |
| 2018.0 | 22135 | |
| 2016.0 | 18987 |
| Value | Count | Frequency (%) |
| 2017.0 | 22946 | |
| 2018.0 | 22135 | |
| 2016.0 | 18987 |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 128136 | |
| 2 | 64068 | |
| 1 | 64068 | |
| . | 64068 | |
| 7 | 22946 | 6.0% |
| 8 | 22135 | 5.8% |
| 6 | 18987 | 4.9% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 320340 | |
| Other Punctuation | 64068 | 16.7% |
Most frequent character per category
| Value | Count | Frequency (%) |
| 0 | 128136 | |
| 2 | 64068 | |
| 1 | 64068 | |
| 7 | 22946 | 7.2% |
| 8 | 22135 | 6.9% |
| 6 | 18987 | 5.9% |
| Value | Count | Frequency (%) |
| . | 64068 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 384408 |
Most frequent character per script
| Value | Count | Frequency (%) |
| 0 | 128136 | |
| 2 | 64068 | |
| 1 | 64068 | |
| . | 64068 | |
| 7 | 22946 | 6.0% |
| 8 | 22135 | 5.8% |
| 6 | 18987 | 4.9% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 384408 |
Most frequent character per block
| Value | Count | Frequency (%) |
| 0 | 128136 | |
| 2 | 64068 | |
| 1 | 64068 | |
| . | 64068 | |
| 7 | 22946 | 6.0% |
| 8 | 22135 | 5.8% |
| 6 | 18987 | 4.9% |
Month
Real number (ℝ≥0)
| Distinct | 12 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 7.8820472 |
|---|---|
| Minimum | 1 |
| Maximum | 12 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 500.7 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 6 |
| median | 9 |
| Q3 | 11 |
| 95-th percentile | 12 |
| Maximum | 12 |
| Range | 11 |
| Interquartile range (IQR) | 5 |
Descriptive statistics
| Standard deviation | 3.33981261 |
|---|---|
| Coefficient of variation (CV) | 0.4237240054 |
| Kurtosis | -0.7716730064 |
| Mean | 7.8820472 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | -0.5901321009 |
| Sum | 504987 |
| Variance | 11.15434827 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 12 | 9041 | |
| 11 | 8556 | |
| 10 | 8205 | |
| 9 | 7246 | |
| 8 | 5895 | |
| 7 | 4821 | |
| 6 | 4339 | |
| 1 | 3800 | |
| 5 | 3667 | |
| 3 | 3087 | 4.8% |
| Other values (2) | 5411 |
| Value | Count | Frequency (%) |
| 1 | 3800 | |
| 2 | 2433 | 3.8% |
| 3 | 3087 | 4.8% |
| 4 | 2978 | 4.6% |
| 5 | 3667 | |
| 6 | 4339 | |
| 7 | 4821 | |
| 8 | 5895 | |
| 9 | 7246 | |
| 10 | 8205 |
| Value | Count | Frequency (%) |
| 12 | 9041 | |
| 11 | 8556 | |
| 10 | 8205 | |
| 9 | 7246 | |
| 8 | 5895 | |
| 7 | 4821 | |
| 6 | 4339 | |
| 5 | 3667 | |
| 4 | 2978 | 4.6% |
| 3 | 3087 | 4.8% |
| Distinct | 7 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.9 MiB |
| Friday | |
|---|---|
| Saturday | |
| Sunday | |
| Thursday | |
| Monday | |
| Other values (2) |
Length
| Max length | 9 |
|---|---|
| Median length | 6 |
| Mean length | 6.987560092 |
| Min length | 6 |
Characters and Unicode
| Total characters | 447679 |
|---|---|
| Distinct characters | 17 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Saturday |
|---|---|
| 2nd row | Saturday |
| 3rd row | Saturday |
| 4th row | Saturday |
| 5th row | Saturday |
| Value | Count | Frequency (%) |
| Friday | 14580 | |
| Saturday | 13738 | |
| Sunday | 12491 | |
| Thursday | 7237 | |
| Monday | 5359 | 8.4% |
| Tuesday | 5334 | 8.3% |
| Wednesday | 5329 | 8.3% |
| Value | Count | Frequency (%) |
| friday | 14580 | |
| saturday | 13738 | |
| sunday | 12491 | |
| thursday | 7237 | |
| monday | 5359 | 8.4% |
| tuesday | 5334 | 8.3% |
| wednesday | 5329 | 8.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 77806 | |
| d | 69397 | |
| y | 64068 | |
| u | 38800 | |
| r | 35555 | |
| S | 26229 | 5.9% |
| n | 23179 | 5.2% |
| s | 17900 | 4.0% |
| e | 15992 | 3.6% |
| F | 14580 | 3.3% |
| Other values (7) | 64173 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 383611 | |
| Uppercase Letter | 64068 | 14.3% |
Most frequent character per category
| Value | Count | Frequency (%) |
| a | 77806 | |
| d | 69397 | |
| y | 64068 | |
| u | 38800 | |
| r | 35555 | |
| n | 23179 | 6.0% |
| s | 17900 | 4.7% |
| e | 15992 | 4.2% |
| i | 14580 | 3.8% |
| t | 13738 | 3.6% |
| Other values (2) | 12596 | 3.3% |
| Value | Count | Frequency (%) |
| S | 26229 | |
| F | 14580 | |
| T | 12571 | |
| M | 5359 | 8.4% |
| W | 5329 | 8.3% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 447679 |
Most frequent character per script
| Value | Count | Frequency (%) |
| a | 77806 | |
| d | 69397 | |
| y | 64068 | |
| u | 38800 | |
| r | 35555 | |
| S | 26229 | 5.9% |
| n | 23179 | 5.2% |
| s | 17900 | 4.0% |
| e | 15992 | 3.6% |
| F | 14580 | 3.3% |
| Other values (7) | 64173 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 447679 |
Most frequent character per block
| Value | Count | Frequency (%) |
| a | 77806 | |
| d | 69397 | |
| y | 64068 | |
| u | 38800 | |
| r | 35555 | |
| S | 26229 | 5.9% |
| n | 23179 | 5.2% |
| s | 17900 | 4.0% |
| e | 15992 | 3.6% |
| F | 14580 | 3.3% |
| Other values (7) | 64173 |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
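The calculation described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the code the report itself runs; the helper name `pearson_r` is hypothetical, and the sample values are the KM Travelled and Cost of Trip entries from the first five rows of the sample table.

```python
import math

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)

# KM Travelled and Cost of Trip from the first five sample rows.
km = [4.44, 8.55, 32.64, 37.76, 35.02]
cost = [48.840, 89.775, 349.248, 438.016, 406.232]

r = pearson_r(km, cost)

# Shifting and rescaling one variable (with a positive factor) leaves r unchanged.
r_rescaled = pearson_r([k * 1000 + 5 for k in km], cost)
```

Because the same normalisation (division by n) is used for the covariance and both standard deviations, the factor cancels, so using n - 1 instead would give the same r.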
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
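As a rough sketch of that definition, the snippet below ranks each variable (ties receive the average of their positions, which is an assumption of this example) and then applies the Pearson formula to the ranks. The helper names `average_ranks`, `pearson_r` and `spearman_rho` are hypothetical, and the data is a toy cubic relationship chosen to show a monotonic but nonlinear case.

```python
import math

def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman_rho(x, y):
    """Pearson correlation of the rank variables."""
    return pearson_r(average_ranks(x), average_ranks(y))

x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]          # monotonic but not linear

rho = spearman_rho(x, y)         # ranks agree perfectly
r_linear = pearson_r(x, y)       # strictly below 1 for this data
```

On this toy data the ranks of x and y coincide, so ρ reaches its maximum while Pearson's r stays below 1, which is the distinction the paragraph above draws.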
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
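The pair-counting formula above can be illustrated directly. The sketch below implements the simple form described in the text (often called τ-a); statistical libraries commonly default to a tie-corrected variant, so results may differ when the data contains ties. The function name `kendall_tau_a` and the toy data are assumptions made for the example.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # A pair is concordant when x and y move in the same direction.
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
        # Pairs tied in x or y count as neither.
    return (concordant - discordant) / len(pairs)

x = [1, 2, 3, 4]
y = [1, 3, 2, 4]                 # one swapped pair out of six

tau = kendall_tau_a(x, y)        # (5 - 1) / 6
```

The double loop over pairs is quadratic in the number of observations, which is fine for illustration but slow for a column with tens of thousands of rows.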
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use a bias-corrected measure, proposed by Bergsma in 2013, that can be found here.
First rows
| | df_index | Transaction ID | Date of Travel | Company | City | KM Travelled | Price Charged | Cost of Trip | Customer ID | Payment_Mode | Gender | Age | Income (USD/Month) | Population | Users | Holiday | Profit | Year | Month | Day of Week |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 8 | 10000163.0 | 2016-01-02 | Pink Cab | PHOENIX AZ | 4.44 | 71.57 | 48.840 | 22557.0 | Cash | Male | 38.0 | 8808.0 | 943999.0 | 6133.0 | - | 22.730 | 2016.0 | 1.0 | Saturday |
| 1 | 9 | 10000164.0 | 2016-01-02 | Pink Cab | PHOENIX AZ | 8.55 | 114.15 | 89.775 | 22469.0 | Card | Male | 37.0 | 4378.0 | 943999.0 | 6133.0 | - | 24.375 | 2016.0 | 1.0 | Saturday |
| 2 | 14 | 10000149.0 | 2016-01-02 | Pink Cab | NEW YORK NY | 32.64 | 498.60 | 349.248 | 533.0 | Card | Male | 52.0 | 15974.0 | 8405837.0 | 302149.0 | - | 149.352 | 2016.0 | 1.0 | Saturday |
| 3 | 19 | 10000092.0 | 2016-01-02 | Pink Cab | LOS ANGELES CA | 37.76 | 851.25 | 438.016 | 8927.0 | Card | Male | 19.0 | 17197.0 | 1595037.0 | 144132.0 | - | 413.234 | 2016.0 | 1.0 | Saturday |
| 4 | 30 | 10000041.0 | 2016-01-02 | Pink Cab | CHICAGO IL | 35.02 | 598.43 | 406.232 | 4289.0 | Card | Male | 19.0 | 28719.0 | 1955130.0 | 164468.0 | - | 192.198 | 2016.0 | 1.0 | Saturday |
| 5 | 45 | 10000070.0 | 2016-01-02 | Pink Cab | DENVER CO | 7.02 | 61.30 | 82.836 | 30718.0 | Cash | Male | 52.0 | 20255.0 | 754233.0 | 12421.0 | - | -21.536 | 2016.0 | 1.0 | Saturday |
| 6 | 48 | 10000171.0 | 2016-01-02 | Pink Cab | SAN DIEGO CA | 14.28 | 269.15 | 147.084 | 20687.0 | Cash | Male | 39.0 | 8926.0 | 959307.0 | 69995.0 | - | 122.066 | 2016.0 | 1.0 | Saturday |
| 7 | 54 | 10000201.0 | 2016-01-02 | Pink Cab | SAN DIEGO CA | 31.68 | 623.77 | 370.656 | 18490.0 | Card | Male | 24.0 | 10573.0 | 959307.0 | 69995.0 | - | 253.114 | 2016.0 | 1.0 | Saturday |
| 8 | 55 | 10000145.0 | 2016-01-02 | Pink Cab | NEW YORK NY | 2.10 | 37.18 | 21.420 | 502.0 | Cash | Male | 28.0 | 15285.0 | 8405837.0 | 302149.0 | - | 15.760 | 2016.0 | 1.0 | Saturday |
| 9 | 65 | 10000067.0 | 2016-01-02 | Pink Cab | DALLAS TX | 33.32 | 308.58 | 386.512 | 25247.0 | Cash | Male | 26.0 | 24178.0 | 942908.0 | 22157.0 | - | -77.932 | 2016.0 | 1.0 | Saturday |
Last rows
| | df_index | Transaction ID | Date of Travel | Company | City | KM Travelled | Price Charged | Cost of Trip | Customer ID | Payment_Mode | Gender | Age | Income (USD/Month) | Population | Users | Holiday | Profit | Year | Month | Day of Week |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 64058 | 359362 | 10433204.0 | 2018-12-31 | Pink Cab | CHICAGO IL | 8.82 | 106.33 | 101.430 | 4162.0 | Cash | Female | 53.0 | 22813.0 | 1955130.0 | 164468.0 | - | 4.900 | 2018.0 | 12.0 | Monday |
| 64059 | 359364 | 10433547.0 | 2018-12-31 | Pink Cab | NEW YORK NY | 27.81 | 417.41 | 333.720 | 1779.0 | Card | Male | 18.0 | 15039.0 | 8405837.0 | 302149.0 | - | 83.690 | 2018.0 | 12.0 | Monday |
| 64060 | 359368 | 10436908.0 | 2018-12-31 | Pink Cab | LOS ANGELES CA | 41.80 | 553.77 | 422.180 | 6564.0 | Cash | Male | 19.0 | 9101.0 | 1595037.0 | 144132.0 | - | 131.590 | 2018.0 | 12.0 | Monday |
| 64061 | 359369 | 10433309.0 | 2018-12-31 | Pink Cab | LOS ANGELES CA | 10.70 | 128.00 | 119.840 | 8175.0 | Card | Male | 24.0 | 12571.0 | 1595037.0 | 144132.0 | - | 8.160 | 2018.0 | 12.0 | Monday |
| 64062 | 359384 | 10436745.0 | 2018-12-31 | Pink Cab | CHICAGO IL | 23.98 | 307.86 | 270.974 | 4948.0 | Card | Male | 65.0 | 2361.0 | 1955130.0 | 164468.0 | - | 36.886 | 2018.0 | 12.0 | Monday |
| 64063 | 359385 | 10433590.0 | 2018-12-31 | Pink Cab | ORANGE COUNTY | 6.36 | 81.86 | 67.416 | 17591.0 | Card | Male | 30.0 | 24699.0 | 1030185.0 | 12994.0 | - | 14.444 | 2018.0 | 12.0 | Monday |
| 64064 | 359386 | 10433494.0 | 2018-12-31 | Pink Cab | NEW YORK NY | 37.12 | 600.00 | 408.320 | 1677.0 | Card | Male | 57.0 | 12975.0 | 8405837.0 | 302149.0 | - | 191.680 | 2018.0 | 12.0 | Monday |
| 64065 | 359388 | 10433435.0 | 2018-12-31 | Pink Cab | MIAMI FL | 2.30 | 29.53 | 23.920 | 9774.0 | Cash | Female | 33.0 | 14322.0 | 1339155.0 | 17675.0 | - | 5.610 | 2018.0 | 12.0 | Monday |
| 64066 | 359389 | 10436696.0 | 2018-12-31 | Pink Cab | BOSTON MA | 27.55 | 377.85 | 330.600 | 60000.0 | Cash | Female | 27.0 | 20303.0 | 248968.0 | 80021.0 | - | 47.250 | 2018.0 | 12.0 | Monday |
| 64067 | 359390 | 10433418.0 | 2018-12-31 | Pink Cab | LOS ANGELES CA | 2.34 | 29.21 | 25.038 | 7650.0 | Card | Female | 32.0 | 17629.0 | 1595037.0 | 144132.0 | - | 4.172 | 2018.0 | 12.0 | Monday |